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We present an optimization scheme that employs a Genetic Algorithm (GA) to determine the 
properties of low-lying nucleon excitations within a realistic photo-pion production model based 
upon an efltective Lagrangian. We show that with this modern optimization technique it is possible 
to reliably assess the parameters of the resonances and the associated error bars as well as to identify 
weaknesses in the models. To illustrate the problems the optimization process may encounter, we 
provide results obtained for the nucleon resonances A(1230) and A(1700). The former can be easily 
isolated and thus has been studied in depth, while the latter is not as well known experimentally. 
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I. INTRODUCTION 

In recent years, in order to study the properties of 
low-lying nucleon resonances and assess their parame- 
ters (masses, widths, and electromagnetic coupling con- 
stants), significant experimental and theoretical efforts 
have been devoted to the process of meson production 
from the nucleon, which is achieved by exciting the 
nucleon resonances by means of photonic or electronic 
probes, and to the study of the decays of these reso- 
nances into mesons (mainly pions) The parameters 
of these resonances, predicted by several theoretical mod- 
els of baryons - lattice Quantum Chromodynamics 
Skyrme models Q, quark models [1| - can be compared 
to the ones extracted from experimental data, which usu- 
ally requires the aid of reaction models. This process of 
extracting the nucleon excitations parameters from ex- 
perimental data is thus a crucial requirement in order to 
validate different hadron models, as it provides a guide 
for improving hadron models and for identifying the most 
reliable ones @. Together with pion scattering off the 
nucleon, single pion photoproduction is the most suit- 
able process for studying the low-lying baryon spectrurn. 
In fact, in recent years the experimental database [6i] 
has increased considerably and many experimental pro- 
grams have been run at different facilities such as LEGS 
(Brookhaven) [7] and MAMI (Mainz) [8,]. 

The extraction of the parameters of the resonances by 
means of a comparison of the results of reaction models 
to experimental data is an excellent example of a highly 
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involved optimization task. Problems in which a set of 
parameters must be established through comparison with 
experimental data are ubiquitous in physics. Often, op- 
timization has been considered a minor topic (at times 
even trivial) by the particle and nuclear physics com- 
munity which has relied on gradient-based optimization 
tools such as MINUIT Sometimes, however, op- 

timization problems are very complicated and gradient- 
based routines alone are not sufficient for the purpose, 
because the function to fit presents a complex structure 
with many local optima in which the codes get trapped 
before reaching anywhere near the desired absolute op- 
timum. Thus, until relatively recently, fitting model pa- 
rameters to data has been a kind of art. This was par- 
ticularly the case when thousands of data needed to be 
compared to the results of sophisticated models that de- 
pended on more than just a few parameters. In such 
cases, many instances of the optimization procedure have 
to be repeated, after manually adjusting the parameters, 
and specific care must be taken to prevent the optimiza- 
tion procedure from getting stuck at the many possible 
local minima positions. 

Recently, in nuclear and particle physics, more credit 
is being given to modern optimization procedures |T2, 
[fllisl Hg} and to error estimations on the parameters 
stemming from the fits. Modern and sophisticated opti- 
mization techniques such as simulated annealing (l7| and 
genetic algorithms (GA) |18i have been developed over 
the last twenty years and have been applied to problems 
which are impossible to tackle with conventional tools. 

In this paper we present a hybrid optimization proce- 
dure which combines a GA with a gradient-based ("hill- 
chmbing") routine E04FCF from the NAG library 0. The 
GA performs the bulk of the optimization efforts, ensur- 
ing that the parameter space is fully surveyed and lo- 
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cal minima are avoided, while the conventional gradient- 
based routine, when applied to the preliminary minima 
found by the GA, provides fine-tuning and speeds up 
convergence. We have applied this tool to a complex, 
multi-parametric optimization problem, namely the de- 
termination of nucleon resonances parameters by com- 
paring the results of a realistic model for the photo-pion 
production reaction to data. As a by-product, the opti- 
mization procedure provides insight into the reliability of 
the values (error bars) of the parameters extracted and 
information on their physical significance. 

This paper is organized as follows: in Section |TT] we 
briefly present the model for pion photoproduction on 
free nucleons from threshold up to 1.2 GeV developed in 
Refs. [U, [3, [3- In Section mil we present the strategy 
applied to solve the problem. In Section IIVI we present 
the GA in detail. In Section |V] we show the results ob- 
tained by the algorithm, analyze its performance and 
comment on the error bar estimates and the physical sig- 
nificance of the parameters extracted. Finally, in Section 
IVII we present our conclusions. 



II. THE REACTION MODEL 

The reaction model is based upon a phenomcnologi- 
cal Lagrangian and it allows us to isolate the contribu- 
tion of the resonances, calculate their bare properties, 
and compare these p rop erties with the values provided 
by nuclconic models [l3, [3 ■ In addition to Born terms 
(those which involve only photons, nucleons and pions) 
and vector-meson exchange terms {p and uj), the model 
includes all four star resonances quoted in the Particle 
Data Group (PDG) ^ up to 1.8 GeV of mass and 
up to spin-3/2: A(1232), N(1440), N(1520), A(1620), 
N(1650), A(1700), and N(1720). The internal structure 
of the nucleonic excitations shows up in the values of 
the electromagnetic coupling constants that appear in 
the Lagrangian. The model displays chiral symmetry, 
gauge invariance, and crossing symmetry, and incorpo- 
rates a consistent treatment of the interaction with spin- 
3/2 particles that avoids well-known pathologies of pre- 
vious models [H, [i3,[lH- Furthermore, the dressing of 
the resonances is considered by means of a phenomeno- 
logical width which takes into account decays into one 
and two tt's and one 77. This width is included in a way 
that fulfils crossing symmetry and thus it contributes to 
both the direct and crossed channels of the resonances. 
We assume that the final state interactions (FSI) in the 
■kN rescattering factorize and can be included through 
the distortion of the ttN final state wave function. We 
include this distortion in a phcnomcnological way by in- 
corporating a phase (5fsi to the electromagnetic multi- 
poles. We fix this phase so that the total phase of the 
electromagnetic multipole is identical to that of the en- 
ergy dependent solution of SAID In this way, we 
disentangle the parameters of the electromagnetic vertex 
from the FSI effects. 



III. MINIMIZATION STRATEGY 

Our minimization procedure follows the one in [l^ 
although we use a more sophisticated GA and employ 
the E04FCF routine from the NAG library Q instead of 
MINUIT [1^ code. We apply the minimization scheme 
to a realistic meson production model and the aim of 
our minimization is different. While in [T^ the aim was 
to establish the existence of certain resonances, in this 
paper our goal is to determine the parameters of well- 
established nucleon resonances and to obtain estimates 
on the reliability of these parameters and their associ- 
ated error bars. 

The function to minimize is the defined by 
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where M^'-^p stands for the current energy independent 
extraction of the multipole analysis of SAID up to 1.2 
GeV for Eq+, Mi_, Ei+, Mi+, E2-, and A/2- multi- 
poles in the three isospin channels / — n for the 
7P TT^p process @. AAI^^^ is the experimental error 
and Al"' is the multipole value given by the model. It 
depends on parameters Ai, ... ,A„. We have taken into 
account 1,880 data for the real part of the multipoles and 
the same amount for the imaginary part. Thus, 3,760 
data points have been used in the fits. Unlike cross- 
sections or asymmetries, electromagnetic multipoles are 
not directly measured quantities and some elaboration 
of the raw experimental data is needed to obtain these 
multipoles. However, we have chosen, as it is very often 
done in this field, to fit electromagnetic multipoles in- 
stead of other observables. Several reasons are mentioned 
when fitting to multipoles. On one hand, electromagnetic 
multipoles are more sensitive to coupling properties than 
other observables, so deficiencies in the model may show 
up more clearly. The second reason is that, in principle, 
all the observables can be expressed in terms of multi- 
poles. Thus, if the multipoles are properly fitted by the 
model, so should be other observables. 

In order to determine the resonance parameters that 
best fit the data, we have written a hybrid optimization 
code based on a GA combined with the E04FCF routine 
from the NAG library [|. Although GA, are compu- 
tationally more expensive than other algorithms, in a 
minimization problem it is much less likely for them to 
get stuck at local minima than it is for other methods, 
namely gradient-based minimization methods. GAs al- 
low us to explore a large parameter space more efficiently. 
Thus, in a multi-parameter minimization such as the one 
we face here, they are probably a very efficient way of 
searching for the best minimum. In the next section we 
will go through the details of the GA. 

The parameters for the model (Xj) are divided into two 
different kinds: (i) Those that are obtained from models 
or experiments other than pion photoproduction, namely 
vector-meson coupling constants (three parameters) and 
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masses and widths of the nucleon resonances (fourteen 
parameters, one mass and one width for each resonance 
which have been taken from ^^), and (ii) those that are 
extracted from pion photoproduction data, namely elec- 
tromagnetic coupling constants (fifteen parameters) and 
the cutoff A for Born terms and vector-meson exchanges. 
We have allowed the algorithm to vary all the parameters 
(see Tables U and in]). However, the parameters in the first 
group have been varied within a very small range, the 
experimentally allowed values for the vector-meson cou- 
pling constants and ±2 MeV for the masses and widths 
of the nucleon resonances. The reason for allowing these 
parameters to vary, even though the range of variation is 
minimal, is to make room for the algorithm to search for 
the global minimum and to take into account the error 
bars for these parameters into the possible solution. This 
should help to prevent the algorithm for being trapped 
in local minima. 

The variation range for the second group of parameters 
are chosen to explore a large region of parameter space. 
Hence we avoid introducing prejudgments on their val- 
ues based on previous analysis. We prefer to use the 
hclicity amplitudes (for their definition and connection 
with coupling constants see Refs. [H, [13, H^]) to de- 
fine the ranges, instead of the electromagnetic coupling 
constants. We allow them to vary in the range [—1.1] 
GeV-i/2. 

The cut-off A is included in the form factors that multi- 
ply the Born terms and vector-meson exchange invariant 
amphtudes. We use the form factors suggested in [23j . 
which respect gauge invariance and crossing symmetry. 
For these Born terms 

Fb{s, u, t) =F{s) + F{u) + G{t) - F{s)F{u) 

- F{s)G{t) - F{u)G{t) + F{s)F{u)G{t), 

(2) 
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where 



F{1) = \ + {l-M^f /K^ 
G{t) ^ \l + {t-mlf/K^ 



I = 



(3) 
(4) 



and s, u, and t are the Mandelstam variables. For vector 
mesons, we adopt Fv{i) — G{t) with the change m^r — > 
mv- To reduce the number of free parameters for the 
model we use the same A for both vector mesons and 
Born terms. 

The form factors take non-resolved structure effects 
and higher order terms in the scattering matrix expansion 
into account. Thus, the cut-off A is related to the energy 
scale of the effective theory and the sensible values for A 
should be of the order of the nucleon mass (actually, in 
our best fit we obtain A = 0.943 MeV). For this reason, 
in the minimization process, we restrict A to the range 
[0.1,2.0] GeV. 

In order to perform the minimization, the range of vari- 
ation of each parameter is mapped into the [0, 1] interval 
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FIG. 1: Example of the evolution for a champion in one run of 
the GA. For the first generations (up to generation 40 or 50) 
evolution is driven by crossover. After this, small improve- 
ments are seen due to mutations. 



for the GA and into (— oo, -|-oo) for the E04FCF routine. 
This latter step is done by means of the transformation 
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where \j is the model parameter, Xj is the mapping of Xj 
into (— oo,+oo), is the highest value of the range 

of variation, and AJ"" is the lowest value. With regard 
to the range of variation allowed for the parameters, we 
must note that gradient routines work more efficiently 
if variations of similar magnitude on each of the search 
parameters introduce a similar variation on the function 
to minimize. The E04FCF user is advised to explore the 
region of parameters to minimize and to provide ade- 
quate rescaling of the problem before calling the routine. 
While the NAG library provides tools that help in this 
task, in our combined algorithm we take advantage of 
the knowledge obtained on the variation of the objec- 
tive function during the previous evaluations performed 
by the GA. We use this exploration to normalize the 
to unity and to rescale all the parameters affecting this 
function so that, according to the last evaluations of the 
best individuals explored by the GA, after rescaling of 
both the parameters and the function to optimize, the 
region explored by the NAG E04FCF routine in its search 
for the minima is expected to lie in a hypercube of unit 
volume. We have indeed verified that this normalization 
and rescaling procedure improves NAG routine perfor- 
mance. 

Our minimization strategy includes the following as- 
pects: 

1. A first generation is made out of individuals ran- 
domly generated within reasonable values of the 
parameters. 

2. Next, the GA is run for 400 generations (see defi- 
nition further on). This number is determined af- 
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TABLE I; Ranges for the parameter values of the nucleon 
resonances. Masses and decay widths have been taken within 
the ranges provided by [S^]. The helicity ampUtudes are de- 
noted by Aj^ , where I stands for isospin and A for the helicity 
of the initial photon-nucleon state. 





M* (GeV) 


r (GeV) 


Ai (GeV-i/2) 


A(1232) 


[1.215,1.219] 


[0.094,0.098] 


[-1,1] 


N(1440) 


[1.381,1.385] 


[0.314,0.318] 


[-1,1] 


N(1520) 


[1.502,1.506] 


[0.110,0.114] 


[-1,1] 


N(1535) 


[1.523,1.527] 


[0.100,0.104] 


[-1,1] 


A(1620) 


[1.605,1.609] 


[0.146,0.150] 


[-1,1] 


N(1650) 


[1.661,1.665] 


[0.238,0.242] 


[-1,1] 


A(1700) 


[1.724,1.728] 


[0.116,0.120] 


[-1,1] 


N(1720) 


[1.740,1.755] 


[0.119,0.278] 


[-1,1] 



TABLE IL Ranges for the values of the parameters of vector 
mesons and cut-off A. 



A (GeV) 



[20.61,21.11] 
[-0.17, -0.15] 
[6.1,6.3] 
[0.1,2.0] 
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ter inspecting the best individual evolution for each 
generation and from comparisons with benchmark 
problems of similar size. We do not really need that 
many as 400 generations (see Fig. [IJ, but we pre- 
ferred to let the algorithm run for more generations 
than necessary in order to ensure that convergence 
was achieved. 

3. After the 400 generations have been run, we intro- 
duce the GA solution as the initial value for the 
E04FCF routine from NAG libraries [1]. We use 
the routine for fine-timing. The E04FCF routine 
implements an algorithm that looks for the uncon- 
strained minimum of a sum of squares 



Minimize 



E 



\fj (xi, . . .,x„) f , (6) 



Reached final generation? 



Yes 



Solution 



FIG. 2: GA scheme (see text in section ITV] 



of m nonlinear functions in n variables (m > n). 
This algorithm does not require the derivatives to 
be known. From a starting point x^^-* , • • ■ , (in 
our case supplied by the GA) the routine applies 
a Quasi-Newton method in order to find the mini- 
mum. This method uses a finite-difference approx- 
imation to the Hessian matrix to define the search 
direction. It is a very accurate and fast converging 
algorithm once we have an initial solution that is 
close to the minimum we seek. Therefore, it is well 
suited for our fine-tuning purpose. 

We note that many attempts to solve our op- 
timization procedure solely by means of E04FCF 
completely failed, even when we guided the initial 
ranges of the parameters by hand. The NAG rou- 
tine got stuck in the first local minimum, usually 
very far from the one obtained by the GA. 

4. We store the solution obtained by the combined 
algorithm and we start again, by generating a dif- 
ferent random seed for the initial population of the 
GA. After running the minimization code twenty 
times, we obtain twenty different minima. If we 
find that all the divided by Xmin (*he minimum 
among all the fits) are close to unity, we stop 
the fitting procedure. 



IV. GENETIC ALGORITHM 

Genetic Algorithms are a specific kind of stochastic 
optimization methods based upon the idea of evolution. 
There are many excellent textbooks on GA [l^. Here 
we will describe the main features of GA that are needed 
to understand our implementation. GAs encode the pos- 
sible solutions to the proposed problem and deal with 
many of these solutions at the same time. Indeed, a 
set of these possible solutions (also called individuals or 
genes) form a population. Each individual in the popula- 
tion is classified according to its fitness value, computed 
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FIG. 3: Examples of the fits obtained to tfie electromagnetic 
multipoles for the reaction 7p — ^ vr^p. Curve conventions: 
Solid: Real part of the electromagnetic multipole; Dashed: 
Imaginary part of the electromagnetic multipole. Data are 
from Rcf. Igl. 



in terms of some objective function related to our op- 
timization problem. In our case, the individual encodes 
the parameters of the Lagrangian and the objective func- 
tion is essentially the of the multipole values compared 
to the prediction of each Lagrangian represented in the 
population. GAs implement operators such as crossing 
among individuals and mutation j2^. As long as both 
the encoding of the problem and the GA operators ex- 
hibit good schema properties (that means that the off- 
spring obtained after breeding two or more individuals 
with some good properties in terms of fitness are, more 
often than not, more fit than any of their parents), the 
evolution or repeated application of the genetic opera- 
tors on the population, combined with a mechanism of 
natural selection (survival of the fittest), would cause 
some individuals to accumulate the good properties (sub- 
schema) initially distributed among different individuals 
in the early population. Provided that the number of 
individuals in the population is large enough for many 
good sub-schemas to be represented in at least some in- 
dividual of the population, then the GAs would evolve 
toward very fit individuals, that is, good solutions to the 



problem. In this work, the obvious sub-schema are the 
parameters of each resonance and a simple encoding in 
which every individual is composed of a set of possible 
values for the parameters of our Lagrangian, would do 
the job. We encode the possible solutions to the problem 
(i.e., a complete set of parameters for the Lagrangian) as 
a series of integer numbers within the range from to 
a maximum value N. For each parameter, this integer 
number represents the value of said parameter within the 
range desired by the user. For instance, a stored value 
of would indicate that the value of the corresponding 
parameter equals the minimum allowed within the range. 
Conversely, the maximum value N would represent the 
stored parameter reaching the maximum allowed within 
the range. We denote this maximum value of these in- 
tegers N as the granularity of our encoding strategy. A 
large value of N implies a very thin granularity, that is, 
relatively small changes in each parameter are possible 
in our encoding strategy and individuals that are very 
similar in terms of the parameters they represent and 
consequently, in their fitness, can be encoded. On the 
other hand, if we want to sample the parameter space 
with reasonable density, a too thin granularity would re- 
quire a very large number of individuals. As we have just 
mentioned, an important choice to make for every GA 
is the number of individuals in the population. When 
the population size increases, the chances for relatively 
less fit individuals of mating with other individuals and 
generating better offspring before disappearing from the 
population, decrease exponentially. We must realize that 
even the less fit individuals (some of them) may have 
good sub-schema needed to encoded the best solution. 
Some of these sub-schema may not be present in other 
more fit individuals in the population, at least during the 
early stages of the evolution. According to results of tests 
with our Lagrangian as well as benchmarks with other 
functions that are easy to compute and have well known 
minima, we have determined that the maximum number 
of individuals we may safely employ in a population for 
our GA is around 400. For this size of the population, 
granularity values from 100 to 1000 have been employed 
in our GA without problems. 

In what remains of this section we simply provide a 
detailed explanation on how the GA we have programed 
works. Our GA proceeds as follows [H 

1. Initial population. We provide a first generation 
consisting of individuals (400 in our calculations) 
that are randomly generated with reasonable values 
of the parameters [26|. 

2. Selection scheme. The genetic algorithm we use 
employs a scaled selection scheme and employs the 
elitist model In this model, the best individ- 
ual (or champion) from the previous generation is 
always included in the current population, ensuring 
that the best solution this far is preserved. This de- 
creases significantly the time the GA takes to find 
an acceptable solution. It has been proved [24|] that 



the GA which introduces elitism (that is, the guar- 
anteed survival of the champion at every step of 
the GA evolution) will eventually converge to the 
absolute optimum, while, in general, the ones that 
do not protect the champion will never reach the 
optimum [27[. 

With regard to the remainder of the population, be- 
sides the champion, the individuals from the previ- 
ous generation (that is, the population in its earlier 
state) are ranked according to the fitness function, 
in our case the value. After this step, we in- 
troduce scaling of the population [2^ determining 
the probability that an individual has to mate and 
survive. We provide a 0.8 probability to the worst 
individual and 1.0 to the best one. This is done 
in order to maintain genetic diversity. Indeed, it is 
necessary to prevent that the best and the worst in- 
dividuals have a too different survival probability. 
If we do not take care to preserve genetic diversity 
in this way, the appearance of a very fit individ- 
ual would make the forthcoming offspring collapse 
to the characteristics of that particularly fit indi- 
vidual too soon. Another important technique to 
maintain diversity is mutation, which is discussed 
further on. 

3. After scaling, we classify the population into two 
sets. Set (a) is composed of the best 25% of the 
individuals and set (b) by the remaining 75%. We 
produce the new generation in the following way: 

• 25% of the individuals are taken from the most 
fit ones from the previous generation. That is, 
set (a) is copied into the next generation. 

• Another 25% is selected through a fight among 
all the individuals (tournament). The out- 
come of the fight is randomly decided, depend- 
ing on probability. Even in the least favorable 
case (that is, if the worse individual fights with 
the best one), the winning probability of (the 
worst) individual is 15%. Winning probabili- 
ties are computed accordingly to the fitness of 
each contender. 

• Another 25% is obtained by means of half- 
elitist crossover. This means that we mate an 
individual from the best 25% of the previous 
generation (set (a)) with any other individual 
in sets (a) or (b). Both individuals are picked 
randomly from their respective sets. 

• The remaining 25% of the offspring are gen- 
erated by mating individuals that are selected 
randomly without restrictions from sets (a) or 
(b). 

We apply two different kinds of crossover: one 
point crossover and arithmetic crossover (28j . In 
one point crossover, a random crossover point for 
both parents is selected. We split each chromosome 
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from the parents into two pieces. We take the sec- 
ond piece of the second parent and attach it to the 
first piece of the first parent. In this way we obtain 
an individual that is a mixture of the two origi- 
nal ones. For the arithmetic crossover, we choose 
at random a number r between and 1, and the 
offspring is calculated weighting the parents with 
weight r and (1 — r). 

^offspring _ ^ ^ ^parent 1 ^ (^^ — r) • ^yP^''^"* ^ (J) 

Half of the crossovers our GA implements are one 
point and the other half are arithmetic. The kind 
of crossover to apply to a given pair of parents is 
chosen at random. 

4. We evaluate the new population and identify the 
new champion. As previously mentioned, it will 
be preserved (elitism). We select other individuals 
to mutate from the rest of the population exclud- 
ing the champion. Indeed, in each iteration of our 
GA we introduce as many mutations as the number 
of individuals in the population divided by three. 
These mutations are distributed at random among 
all the individuals (excluding the champion) of the 
population generated following the previous steps. 
We apply two types of mutation '2^. The per- 
mutation mutation exchanges two parameters se- 
lected at random. The gaussian mutation changes 
the value of a parameter by a small amount. The 
amount of change induced by this mutation is ran- 
dom within a small range. The reason to introduce 
mutations is that, quite often, the crossover op- 
erator and the selection method are too effective 
and they end up driving the GA toward a pop- 
ulation of individuals that are almost exactly the 
same. When the population consists of similar in- 
dividuals, the likelihood of finding new solutions 
typically decreases. The mutation operator intro- 
duces an additional randomness into the search. It 
helps to maintain diversity and to find solutions 
that crossover alone might not discover. 

5. After these steps are taken, we say that a new gen- 
eration is built. If we have not reached the limit 
in the number of generations, we run the algorithm 
again with the current set of individuals as the ini- 
tial population. 

When the maximum number of generations has 
been reached, we take the set of parameters en- 
coded by the champion as the solution given by 
GA to our problem. If sufficient generations have 
been run, most of the individuals will have close 
values for the fitness function. 

It has been proven that there is no optimal algorithm 
that adapts well (that is, reaches a solution in the least 
number of evaluations) to all kind of problems. This is 



TABLE III: Helicity amplitudes obtained in the fits in 
GeV-i/^ 





AV 
^1/2 


^1/2 


^1/2 


AP 

^3/2 


^3/2 


A ^ 
^3/2 


A(1232) 




-0.120 






-0.229 




N(1440) 


0.060 




-0.089 








N(1520) 


-0.007 




0.032 


0.107 




-0.085 


N(1535) 


0.014 




-0.137 








A(1620) 




-0.023 










N(1650) 


-0.022 




0.003 








A(1700) 




0.139 






-0.127 




N(1720) 


0.143 




0.126 


-0.004 




-0.444 



often referred to as the no free lunch theorem in optimiza- 
tion ^3Q] . Our goal here however is not to find the opti- 
mal algorithm that obtains the minimum to our problem 
in less evaluations but, rather, to develop a general tool 
that can be applied to many different models of param- 
eter data fitting without specific fine-tuning nor human 
intervention, even if the performance of the tool is sub- 
optimal in terms of the number of operations. In this 
regard, GAs are a handy choice, as they are suitable for 
many different problems. Thanks to scaling and elitism, 
our GA converges neither too quickly nor too slowly and 
generally it is able to find good candidates for the global 
optimum. 

When the individuals are very fit, it can be hard for 
the GA to evolve further, mainly because the path to 
the best individual may involve two or more consecutive 
mutations where each of these mutations on their own 
will produce a less fit individual that will sooner be re- 
moved from the population. The occurrence of such two 
favorable mutations in the same individual is unlikely 
and tailored procedures must be implemented to intro- 
duce specific mutations that are adequate for particular 
problems, or more complex operators like the 'tunnel- 
ing algorithm' [3l| or complex rules to encode the values 
of the functions, like Mendelian operators implementing 
a non-dominant character for some genes [32| . In our 
work, however, we prefer to employ a hybrid optimiza- 
tion method that combines a standard hill-climbing al- 
gorithm with a GA. Hybrid optimization methods have 
been under study intensively [l^. [ssj . We have compared 
several ways of hybridizing GAs and conventional gradi- 
ent based hill-climbing algorithms, such as introducing 
the hill-climbing algorithm as another mutation opera- 
tor. However, we have noticed that this will only make 
the GA converge sooner, very often too soon, resulting in 
it getting stuck at any of the many local minima. From 
our experience, if the hill-climbing procedure is intro- 
duced just at the end of the evolution, when the GA has 
converged, the best results are achieved and a robust al- 
gorithm that requires no human intervention is this way 
configured. Also, no granularity is introduced in this fi- 
nal step of optimization. Indeed, the NAG routine is not 
restricted to integer values of the parameters, but instead 
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FIG. 4: Many local minima and the effect of the fine tun- 
ing performed by the E04FCF routine in the x^/Xmin are 
shown. Conventions: Open circles, /Xmin obtained by the 
GA alone (400 generations with 300 individuals each); Solid 
squares: X^/Xmin obtained by the GA plus the NAG routine. 

represents each parameter as floating point values. Thus, 
we can also consider that the GA finds the best optimum 
that can be represented within the grid implied by the 
granularity N, and starting from this point of the grid, 
the NAG routine refines a search not bound to any grid 
values. 



V. RESULTS 

In Fig. [3] we show examples of fits to electromagnetic 
multipoles for the — > 7r°p process and the overall 
agreement obtained. The values of the parameters are 
summarized in Table lTIIl In Fig. [l]we display an example 
of the evolution of the champion along the generations. 
Two hundred generations are sufficient enough to achieve 
convergence, but we run the algorithm for another two 
hundred generations to see the effects of mutations, which 
can reach areas of the parameter space that are not being 
fully surveyed by means of crossover. 

We observe that at the early stages of the evolution the 
fitness function improves quickly, as crossover works to 
concentrate the good schema from other individuals into 
a good individual. Actually, a very steep slope in this 
region might indicate that evolution is too fast and that 
less fit individuals could disappear from the population 
before their good properties are transmitted to more fit 
individuals. 

When a jump in the X^/Xmin happens, it is due to 
the appearance of a more fit new individual, either due 
to crossover or to mutation. In Fig. 3] we can verify 
the existence of many local minima (so this is certainly 
an ill-posed optimization problem) and the fine tuning 
achieved by the NAG routine which improves minima by 
approximately 2%. 

An important issue to consider in GAs is efhciency. 
As we have already mentioned, the parameter space has 
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FIG. 5: Helicity amplitudes (equivalent to the coupling constants of the Lagrangians) of the A(1232). In all the figures we 
show the twenty minima obtained in the full minimization procedure (GA-fNAG, see Fig. |4]). The upper- left hand figure (a) 
shows the x^/Xmin versus the amplitude vlf^j- The lower-left hand figure (b) shows the x^/x^min versus the amplitude Ay2- 
The right panel (c) shows versus parameters. 



to be discretized with a certain granularity and the algo- 
rithm searches for the best solution within the discretized 
version of the parameter space. The size of this space 
significantly affects the efficiency of the algorithm, thus 
a balance between granularity and computing time has 
to be achieved. The gradient based routine allows us to 
gain precision and efficiency because we do not need the 
GA to reach the minimum, we simply need it to provide 
a value close enough for the E04FCF routine can reach 
it. In other words the GA has to reach the region where 
the minimum lies, and once in this region, reaching the 
minimum is a task for the gradient-based routine. 

We must emphasize that the use of our algorithm is 
unattended. That is, we submit the script that starts 
20 instances of the GA-I-NAG procedure, and after the 
equivalent to five CPU-days (Opteron, 2 GHz), we get 
the results for the optimized set of parameters. No fur- 
ther human intervention was needed to choose initial val- 
ues of the parameters or to guide the evolution. While 
the GA-t-NAG may require more (costly) evaluations of 
the objective function, it is robust and needs no train- 
ing nor good guesses of the initial parameters. Now that 
computer power seems to be an increasingly available 
resource, the unattended mode of operation makes this 
hybrid algorithm a very interesting alternative for these 
optimization problems. 

Figs. [5] and [S] show a typical situation that may arise 



when the parameters are being determined. For A(1232) 
the minimum is well-established and all the minima are 
constrained in a small region. The size of the region 
where the minima lie may provide a better estimation 
of the error associated with the parameters than the one 
provided by the correlation matrix. On the other hand, in 
Fig. Othe value for the A^^^ helicity amplitude appears 
to be in one of two split regions that are too close to be 
physically distinguished (left-upper panel). One region 
is centered at -0.120 and the other at -0.119 GeV"^/^. 
The identification of these regions is one of the functional- 
ities that GAs provide and one of their main advantages. 
When multiple regions containing minima of similar qual- 
ity appear, the possible physical implications should be 
considered and further analysis to assess whether these 
different regions hold physical meaning (see subsection 
IV A|) is required. 

We also show the minima of the A (1700) that are con- 
strained in just one region. However, this region is larger 
than for the A(1232) and the experimental information 
available for this resonance, thus, yields parameters that 
are not as well established as for other nucleon excita- 
tions. 

The evolution of the position of the parameters for dif- 
ferent instances of the GA-I-NAG procedure as the num- 
ber of generations employed in the GA increases, is shown 
for the A (1700) resonance in Figs. [7] and [5] We can ob- 
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serve how the region of the minimum decreases while the 
GA evolves. For the 50 generations plus NAG run (as- 
terisks in Figs. [7]and[8|) the is far away from its best 
value. This case exemplifies what happens when the pa- 
rameters are assessed using the E04FCF after the GA had 
not been converged and therefore the GA is merely pro- 
viding 'very smart guesses', for the starting point of the 
gradient based routine. We observe in this case that the 
results spread over a wide range of values of the param- 
eters, showing that indeed this is a hard optimization 
problem. Indeed, we expect that the starting values pro- 
vided by unconverged instances of the GA are in fact 
much better that the ones we may figure without the 
aid of the GA. It is clear that to reach even an average 
quality optimum would be extremely hard (if not impos- 
sible) without the GA phase of our algorithm. After 150 
generations plus NAG (open squares in Figs. [7| and [8|) 
the result looks much better, showing a region where the 
values of the parameters are well delimited. The is 
remarkably better and close to the best values obtained 
after 400 generations plus NAG (solid circles in Figs. [6l 
H and El). 



A. Minima Split in Various Regions 

The amount and quality of data is of great importance 
in assessing the parameters of any model. The pion pho- 
toproduction multipole data set employed for the fits in 
this work is the largest and of the highest quality ever 
available. It was released in 2006 and includes 3,760 
data points. It is interesting to see what would hap- 
pen if we employ the 2005 SAID database instead, which 
includes up to 1.0 GeV photon energy and considerably 
fewer (1526) data points, as done in Refs. [13, 14, 19]. We 
find that the results change for the not so well-determined 
resonances as is the case of the A(1700) one. We find in 
this case several minima lying in more than one region. 
Fig. [5] is equivalent to Fig. [5] but in this case fitting to 
the former data set. It becomes apparent how the min- 
ima split into two distinct regions. Fig. [TO] is equivalent 
to Fig. [8] and shows how the regions are formed as the 
algorithm evolves. It also shows that a gradient method 
alone leads the optimization to incorrect answers most 
of the time. There are several possible reasons for the 
appearance of this minima structure. For instance, this 
can be caused by deficiencies in the model or in the data. 
We must keep in mind however, that this result can even 
have a physical meaning such as a possible shape coexis- 
tence for a state that can fit the data equally well for two 
sets of parameter values. This would have to be stud- 
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after 400 generations plus NAG. 



ied within a model in which the resonance is included as 
a combination of both states and re-fit to experimental 
data. However, it seems that this is not the situation we 
encountered here. The results presented in the previous 
subsection and in Fig. [H] clearly indicate that improving 
the database and extending the model to higher energies 
(which allows one to account for the tail of the A(1700) 
resonance) are sufficient to collapse the two regions 
into one single region. 
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kind of multi-parameter optimizations when the param- 
eter space is large and the function to fit presents many 
local minima. The assessment of the low-lying resonances 
properties by means of reaction models is an example of 
a very difficul t op timization problem for conventional al- 
gorithms [il[il[il. 

The hybrid optimization procedure presented in this 
paper is a powerful and versatile optimization tool that 
can be applied to many problems in physics that involve 
the determination of a set of parameters from data. It is 
a promising method for extracting both reliable physical 
parameters as well as their confidence intervals. Indeed, 
computing correlations among different parameters by 
comparing different solutions obtained by the hybrid op- 
timization method, in a manner similar to what is shown 
in the panel on the right in Fig. 6, is probably more 
meaningful than the simple covariance matrices returned 
by gradient based optimization routines. 

Finally, we have shown how we can use the procedure 
we have outlined to identify weaknesses in the model and 
assess the reliability of the parameters obtained. Not 
only the error bars have to be considered when quoting 
the uncertainty in the determination of a parameter, but 
also whether the minima are concentrated into one single 
region or split into several ones, and the possible physical 
explanations of such situation. 



VI. FINAL REMARKS 

We have presented a hybrid optimization procedure 
which combines a GA with the gradient-based routine 
E04FCF from the NAG libraries. We have successfully 
applied this algorithm to determine the coupling con- 
stants of the low-lying nucleon resonances within a real- 
istic Lagrangian model of the pion photoproduction re- 
action. The results for the couplings were summarized in 
Table Uni 

Traditional optimization tools are often useless for this 
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